A Bayesian model for cross-study differential gene expression.

Authors

  • Robert B Scharpf
  • Håkon Tjelmeland
  • Giovanni Parmigiani
  • Andrew B Nobel
Abstract

In this paper we define a hierarchical Bayesian model for microarray expression data collected from several studies and use it to identify genes that show differential expression between two conditions. Key features include shrinkage across both genes and studies, and flexible modeling that allows for interactions between platforms and the estimated effect, as well as concordant and discordant differential expression across studies. We evaluated the performance of our model comprehensively, using both artificial data and a "split-study" validation approach that provides an agnostic assessment of the model's behavior not only under the null hypothesis, but also under a realistic alternative. The simulation results from the artificial data demonstrate the advantages of the Bayesian model. The 1 - AUC values for the Bayesian model are roughly half of the corresponding values for a direct combination of t- and SAM-statistics. Furthermore, the simulations provide guidelines for when the Bayesian model is most likely to be useful. Most notably, in small studies the Bayesian model generally outperforms other methods when evaluated by AUC, FDR, and MDR across a range of simulation parameters, and this difference diminishes for larger sample sizes in the individual studies. The split-study validation illustrates appropriate shrinkage of the Bayesian model in the absence of platform, sample, and annotation differences that otherwise complicate experimental data analyses. Finally, we fit our model to four breast cancer studies employing different technologies (cDNA and Affymetrix) to estimate differential expression in estrogen receptor positive tumors versus negative ones. Software and data for reproducing our analysis are publicly available.
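The 1 - AUC figure quoted in the abstract can be made concrete with a small sketch. The code below is not the authors' software; it is a minimal illustration, assuming each gene has a score (larger meaning stronger evidence of differential expression) and a known true label, as would be the case in a simulation. The function names `auc` and `one_minus_auc` and the toy inputs are hypothetical. AUC is computed through the standard Mann-Whitney rank-sum identity, with tied scores given their average rank.

```python
def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney rank-sum identity.

    scores: one number per gene; larger means stronger evidence of
            differential expression (illustrative assumption).
    labels: 1 for truly differentially expressed genes, 0 otherwise.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the block of tied scores starting at sorted position i.
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # 1-based average rank of the block
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0  # Mann-Whitney U statistic
    return u / (n_pos * n_neg)


def one_minus_auc(scores, labels):
    """The error-style summary quoted in the abstract: smaller is better."""
    return 1.0 - auc(scores, labels)


# Toy example with four genes (made-up numbers, not data from the paper).
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_result = one_minus_auc(toy_scores, toy_labels)
```

Under this reading, a method whose 1 - AUC is half that of another misorders half as many (differentially expressed, non-differentially expressed) gene pairs.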

Similar resources

Global gene expression analysis using microarray to study differential vulnerability to neurodegeneration

Neurodegenerative disorders such as Parkinson’s disease, motor neuron disease and Alzheimer’s disease are characterized by loss of specific cells within certain regions of the brain. One of the most compelling questions is why specific cell populations are vulnerable to neurodegeneration. We addressed this question by studying global gene expression changes using an animal model of ...

Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis

Recognizing genes with distinctive expression levels can help in the prevention, diagnosis and treatment of diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering gene expression datasets. Fast GKM is a significant improvement on the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...

XDE: A Bayesian hierarchical model for analysis of differential expression in multiple studies

There are many publicly available high-throughput gene expression studies that address comparable biological questions with similar patient populations. For economic and practical reasons, many of these studies have a relatively small number of biological replicates. To improve statistical power, it is of interest to combine observed data from several microarray studies, potentially measure...

Differential Expression of Alpha S1 Casein and Beta-Lactoglobulin Genes at Different Physiological stages of the Adani Goats Mammary Glands

Background: Milk protein genes have been a focus of research as candidate target genes that play a decisive role in animal breeding. Objectives: In the present study, the transcriptional levels of Beta-lactoglobulin (BLG) and Alpha S1 casein (CSN1S1) genes were investigated during prenatal, milking and drying times in mammary glands of the Adani goats which showed...

Journal:
  • Journal of the American Statistical Association

Volume 104, Issue 488

Pages: -

Publication date: 2009